Minimal Test Collections for Relevance Feedback
Authors
Abstract
The Information Retrieval Lab at the University of Delaware participated in the Relevance Feedback track at TREC 2009. We used only the Category B subset of the ClueWeb collection; our preprocessing and indexing steps are described in our paper on ad hoc and diversity runs [10]. The second year of the Relevance Feedback track focused on the selection of documents for feedback. Our hypothesis is that documents that are effective at distinguishing systems by mean average precision will also be good documents for relevance feedback. We therefore applied the MTC (Minimal Test Collections) document selection algorithm developed by Carterette et al. [6, 4, 9, 5], which is used in the Million Query Track [2, 1, 8] to select documents to be judged in order to rank systems correctly. Our approach can thus be described as “MTC for Relevance Feedback”.
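As a rough illustration of this selection idea (a sketch, not the exact MTC algorithm of Carterette et al.), the following code greedily picks the unjudged documents whose judgments would most affect the average-precision difference between runs. The `1/rank` weighting and all names here are simplifying assumptions for illustration only:

```python
from itertools import combinations

def mtc_select(runs, judged, k):
    """Pick k unjudged documents that most affect AP differences.

    runs: dict mapping run name -> ranked list of doc ids.
    judged: set of already-judged doc ids.
    Illustrative simplification of MTC: each document is weighted by
    the summed |1/rank_i - 1/rank_j| over all run pairs, a proxy for
    how strongly its judgment separates runs by average precision.
    """
    weights = {}
    for a, b in combinations(runs, 2):
        rank_a = {d: r for r, d in enumerate(runs[a], 1)}
        rank_b = {d: r for r, d in enumerate(runs[b], 1)}
        for d in set(rank_a) | set(rank_b):
            if d in judged:
                continue
            wa = 1.0 / rank_a.get(d, float('inf'))  # 0 if unranked
            wb = 1.0 / rank_b.get(d, float('inf'))
            weights[d] = weights.get(d, 0.0) + abs(wa - wb)
    return sorted(weights, key=weights.get, reverse=True)[:k]
```

Documents ranked very differently by two runs (e.g. first in one, last in the other) get the highest weight, since judging them tells us most about which run is better.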
Similar Resources
Parsimonious Relevance Models for Multiple Corpora
We describe a method for applying parsimonious language models to re-estimate the term probabilities assigned by relevance models. We apply our method to six topic sets from test collections in five different genres. Our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most ...
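The re-estimation step described above can be sketched as an EM loop that discounts terms well explained by a background (collection) model and renormalizes, concentrating probability on distinctive terms. The mixing weight `lam`, iteration count, and function names are illustrative assumptions, not the paper's exact procedure:

```python
def parsimonize(term_probs, background, lam=0.5, iters=20):
    """Re-estimate relevance-model term probabilities parsimoniously.

    term_probs: dict term -> P(t|R) from a relevance model.
    background: dict term -> P(t|C) collection probability.
    E-step: keep only the fraction of each term's mass attributed to
    the topical model rather than the background; M-step: renormalize.
    """
    p = dict(term_probs)
    for _ in range(iters):
        e = {}
        for t, pt in term_probs.items():
            num = lam * p[t]
            denom = num + (1 - lam) * background.get(t, 1e-9)
            e[t] = pt * (num / denom)
        total = sum(e.values())
        p = {t: v / total for t, v in e.items()}
    return p
```

A common stopword like "the" has high background probability, so its mass is pushed down, while corpus-distinctive terms gain weight.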
Toshiba BRIDJE at NTCIR-6 CLIR: The Head/Lead Method and Graded Relevance Feedback
At NTCIR-6 CLIR, Toshiba participated in the Monolingual and Bilingual IR tasks covering three topic languages (Japanese, English and Chinese) and one document language (Japanese). For Stage 1 (which is the usual ad hoc task using the new NTCIR6 topics), we submitted two DESCRIPTION runs and two TITLE runs for each topic language. Our first search strategy is Selective Sampling with Memory Rese...
Flexible Pseudo-Relevance Feedback via Direct Mapping and Categorization of Search Requests
This paper explores various strategies for enhancing the reliability of pseudo-relevance feedback using TREC and NTCIR test collections. For each test request, the number of pseudo-relevant documents or the number of expansion terms is determined based on a similar training request (i.e. via direct mapping) or a group of similar training requests (i.e. via categorization). The results ...
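The direct-mapping strategy can be sketched as a nearest-neighbour lookup over training requests. The Jaccard similarity on query terms and the parameter names (`n_docs`, `n_terms`) are hypothetical stand-ins for whatever similarity measure and parameters the paper actually uses:

```python
def map_parameters(test_terms, training):
    """Choose feedback parameters via direct mapping.

    training: list of (terms, params) pairs, where params is e.g.
    {"n_docs": ..., "n_terms": ...}. Returns the parameters of the
    training request most similar to the test request.
    """
    def jaccard(a, b):
        a, b = set(a), set(b)
        return len(a & b) / len(a | b) if a | b else 0.0

    best_terms, best_params = max(
        training, key=lambda item: jaccard(test_terms, item[0]))
    return best_params
```

The categorization variant would instead average (or vote over) the parameters of a whole group of similar training requests.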
On the Evaluation of the Quality of Relevance Assessments Collected through Crowdsourcing
Established methods for evaluating information retrieval systems rely upon test collections that comprise document corpora, search topics, and relevance assessments. Building large test collections is, however, an expensive and increasingly challenging process. In particular, building a collection with a sufficient quantity and quality of relevance assessments is a major challenge. With the gro...